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ABSTRACT 

We present an algorithm for simulating the equations of ideal magnetohydrodynamics and other 
systems of differential equations on an unstructured set of points represented by sample particles. The 
particles move with the fluid, so the time step is not limited by the Eulerian Courant-Friedrichs-Lewy 
condition. Full spatial adaptivity is required to ensure the particles fill the computational volume, and 
gives the algorithm substantial flexibility and power. A target resolution is specified for each point in 
space, with particles being added and deleted as needed to meet this target. We have parallelized the 
code by adapting the framework provided by GADGET-2. A set of standard test problems, including 
10~ 6 amplitude linear MHD waves, magnetized shock tubes, and Kelvin-Helmholtz instabilities is 
presented. Finally we demonstrate good agreement with analytic predictions of linear growth rates for 
magnetorotational instability in a cylindrical geometry. This paper documents the Phurbas algorithm 
as implemented in Phurbas version 1.1. 

Subject headings: Magnetohydrodynamics (MHD), Methods: numerical, Hydrodynamics 



1. INTRODUCTION 



In Maron et al. (20121 hereafter Paper I) we described 
an adaptive, Lagrangian, meshless, method for magne- 
tohydrodynamics (MHD). We now describe the parallel 
implementation of this algorithm and its tests, and dis- 
cuss its practical properties. The test problems used are 
selected from the accepted ones in th e literature; many 
follow those docu mented for A thena 3 ( Stone et al.[2 008), 
and those used b y|T6th| ( [2000] ), and ]Ryu~S~Jones||l995| . 
The three goals of the tests are to verify the convergence 
to smooth solutions of the MHD equations, demonstrate 
the global conservation properties and shock errors, and 
finally to verify the ability of Phurbas to model the linear 
growth of m agnetorotational instability correctly. 

In § 2.1 we describe how we have implemented our 



algorithm into a parallel code called Phurbas using the 
GADGET-2 codeQ ( |Springel|[2005l ) as a basis. We de- 
scribe the tests that we have performed with this im- 
plementation in § [3] We summarize the results of the 
tests, and discuss the implications and future prospects 
for Phurbas-like methods in § [4] 

2. IMPLEMENTATION OF THE ALGORITHM 

Phurbas is a parallel code using the Message Pass- 
ing Inter face , following the patterns of the GADGET- 
2 code \ Springe! 2005), which originally combined a 
smoothed particle hydrodynamics (SPH) treatment of 
gas dynamics with a tree solution of the Poisson equa- 
tion to follow N-body dynamics. The serial particle up- 
date module described in Paper I has been incorporated 
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into a modified version of GADGET-2 that has been 
re-purposed to serve as a parallel framework. We re- 
fer to the version documented in this work as Phurbas 
version 1.1 to allow for the differentiation of future mod- 
ifications to the algorithm and the code. (We note that 
tests of Phurbas 1.0 were described in the first submitted 
version of this paper, posted to ArXiv as version 1 of this 
paper.) 

We summarize here the algorithm described in detail 
in Paper I. Phurbas solves the equations of ideal MHD, 
expressed in terms of Lagrangian time derivatives. The 
equations are discretized on an adaptive set of particles 
that carry values of the field variables (density p, velocity 
V, magnetic field B, and internal energy density a). The 
particles move with the velocity V that they carry, mak- 
ing the Lagrangian formalism the natural description of 
the time evolution of the field variables. The evolution 
equations relate time derivatives of the field variables to 
their spatial derivatives. To calculate the spatial deriva- 
tives, Phurbas uses a local, third-order, polynomial, mov- 
ing least squares interpolation, using neighbor particle 
values drawn from a sphere of radius r/,j = 2.3A^ about 
particle i, where is the effective resolution parameter. 
A second order predictor-corrector scheme is used with 
the time derivatives obtained by applying the MHD equa- 
tions to the approximate spatial derivatives to advance 
the particle positions and variable values. 

Particles are added and deleted, filling voids and de- 
stroying clumps, to ensure that each sphere of radius r f is 
well sampled. On average, the particle creation and dele- 
tion results in at least one particle per volume A 3 . The 
resolution parameter A need not be constant in space or 
time, and each particle has an individual time step. In 
this sense Phurbas is fully adaptive both spatially and 
temporally. The time steps are independent of the bulk 
velocity of the flow. Spatial variation of the time steps 
is limited to prevent the penetration of short time step 
particles into regions of long time step particles. 

To obtain numerical stability, Phurbas integrates a 
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modified version of the ideal MHD equations including 
a bulk viscosity. The bulk viscosity coefficient is broken 
into two scalar fields. The first is scaled so that mo- 
tions near the grid scale are always damped. The second 
adapts to provide damping in regions with large change, 
such as shocks and contact discontinuities. A diffusive 
correction term in the induction equation locally diffuses 
nonzero V • B away. 

2.1. GADGET-2 Based Implementation 

In this section we outline the changes made to 
GADGET-2 to adapt it to Phurbas, which serves also 
to highlight the differences in the infrastructure needed 
to support SPH and Phurbas. The largest modifications 
are the removal of the SPH algorithm for evaluating the 
spatial derivatives, the addition of data passing and el- 
ements in the data structures needed for the Phurbas 
MHD algorithm, the modification to support addition 
and deletion of gas-type particles, the addition of mag- 
netic field variables, and a new main time advance loop. 
Additions to the input-output routines and global statis- 
tics were also made. 

In GADGET-2 SPH, the sums required for value and 
gradient evaluation are computed in a distributed man- 
ner. The contribution to each sum from a set of neigh- 
bor particles can be computed on the processor where 
those neighbors reside, so that only the total value need 
be communicated. For the Phurbas particle update, the 
values for all particles within r/^ of particle i must be 
communicated to the process hosting particle i. Particles 
are updated in batches, so within groups of ~ 5, 000- 
10,000 particles duplicate neighbor communications can 
be avoided. 

Phurbas requires only a strict radius-based neighbor 
search, unlike in GADGET-2 SPH where a mutual 
neighbor relation is required to achieve symmetric in- 
terparticle forces. The Phurbas neighbor search is also 
performed on a fixed radius volume, so it does not need 
iterative refinement like the GADGET-2 SPH neighbor 
search, which adapts the neighbor search radius to in- 
clude a target number of neighbors. In Phurbas, the mov- 
ing least squares interpolation and the particle adaption 
algorithm require information from the neighbor parti- 
cles of particle i to be retrieved to the host process of 
particle i. The set of particles used for these three pro- 
cedures is the same, the particles within Tf. 

Compared to GADGET-2 SPH particles, Phurbas 
particles use somewhat more memory. Phurbas uses 56.5 
eight byte variables per particle as opposed to the 36.5 
used by GADGET-2 in double precision mode. We 
retain the basic structures of GADGET-2 in that the 
variables required for the neighbor tree construction are 
stored in a separate array from the fluid quantities. As 
the memory required to complete the retrieval of neigh- 
bor information can be much larger than in GADGET- 
2 SPH, and dynamically varies, we dynamically allo- 
cate buffers as needed to complete this step and free 
the memory after the process is finished. Compared to 
GADGET-2 SPH, we have modified the code to avoid 
persistent allocation of memory, which in particular al- 
leviates problems with computing the domain decompo- 
sition with large numbers of parallel processes. 

The void search in Phurbas described in Paper I relies 
on computing distances to neighbor particles placed on 



a 9 3 grid. This procedure gains in computational speed 
if the three-dimensional grid is implemented as a one- 
dimensional list in memory that is shortened each time 
a grid point is eliminated from consideration as a near- 
est neighbor. This minimizes the number of times each 
grid point must be accessed, while keeping the values 
arranged compactly in memory. We have implemented 
this strategy by simply replacing the coordinates of any 
eliminated point with the coordinates of the current last 
point on the list and shortening the length of the list. 

GADGET-2 does not support dynamic particle addi- 
tion and deletion. In Phurbas, we first perform a load 
balance on the list of particles to add to voids, in order 
to distribute them among processes evenly. The parti- 
cles are created in free memory spaces on the processors 
where they are moved to by the load balance algorithm. 
Particles in clumps are deleted from the processors they 
are resident on. 

As we have now deleted some particles, and added oth- 
ers to essentially random processors, we perform a new 
global GADGET-2 load balance calculation, followed by 
a Peano-Hilbcrt ordering and tree build, as in the stan- 
dard GADGET-2 algorithm ( |Springel|2005| ) . Doing this 
on the entire particle list brings the new particles to op- 
timal positions on the processors and provides neighbor 
information for the subsequent processing stages. 

The restriction of local time step variations only re- 
quires information from particle i to be sent to the host 
processes of the neighbors of particle i.Time steps are 
rounded down to powers of two in order to use the 
GAD GET-2 binary block time step scheme (Springel 



2005 1 , and the increase in time step is limited with the 
GADGET-2 synchronized time step scheme. 

The most significant optimization made has been for 
the assembly of the left hand side matrix for the least 
squares fitting problem used in the polynomial interpo- 
lation. This has been explicitly loop-unrolled into very 
simple FORTRAN designed to maximize the ability of 
the compiler to pipeline the instructions and optimize 
cache use. Despite the increased memory use compared 
to GADGET-2 SPH, our experience has been that sim- 
ulations are computation limited rather than memory 
limited. 

Compared to dark matter dominated, cosmological, N- 
body problems, pure MHD fluid problems intrinsically 
require more work per particle per time step. For the 
highest resolution run in the cylindrical geometry MRI 
test in the following section, we used ~ 10 6 particles 
on 48 cores, 95% of which were typically active, non- 
boundary particles. This was run on the Texas Advanced 
Computation Center Lonestar cluster, which consists of 
nodes with dual Xeon Intel, hexacore, 3.33 GHz, 64-bit, 
Westmere processors (13.3 GFLOPS per core) intercon- 
nected with InfiniBand QDR. Phurbas took an average 
~ 1.3 x 10~ 4 seconds for the serial procedures per particle 
(void and clump checks, moving least squares interpola- 
tions, time derivative calculations, and time integration 
corrector step). We note that the performance of this 
section of the code depends highly on the compiler and 
optimization settings used. The wallclock time per step 
was ~ 5 seconds for a step involving ~ 9.1 x 10 5 active 
particles, giving a speed of ~ 2.6 x 10 -4 seconds per par- 
ticle or 3.8 x 10 3 total updates per core per wallclock 
second, including adaptivity involving ~ 1500 additions 
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and deletions. It should be noted that Phurbas version 
1.1 code has not yet been heavily optimized, so we believe 
the parallel overhead can be further reduced. 

For a given application to be suitable for simulation 
with Phurbas, the Lagrangian nature, adaptivity, and 
individual time steps of Phurbas must offer a significant 
advantage on a problem. Otherwise a high-order, grid- 
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3. TESTS 



We present in this section a series of tests that serve 
to verify the performance of Phurbas, and illustrate its 
abilities and limitations. These include linear waves 



(Sec. 3.1), circularly polarized Alfven waves (Sec. 3.2 ), 

hydrodynamical and MHD shock tubes (Sec. 3~73~), 

Kelvin-Hehnholtz instabilities (Sec. 3.4), and cylindrical 
magnetorotational instability (Sec. 575). 

As Phurbas is meshless, it is an intrinsically three- 
dimensional algorithm. The performance and design cri- 
teria arc different in different numbers of dimensions. 
Hence, even when tests have symmetries such that they 
could be stated in a lower number of dimensions, we per- 
form them in fully three-dimensional domains to achieve 
results reflective of the true capabilities of the code. 

It is important to use realistic particle distributions for 
these tests. Though perfectly gridded particle distribu- 
tions would yield good results for some tests, as has been 
demonstrated in the literature, such well arranged distri- 
butions cannot be expected in a general flow. Instead, 
we realize cubes of relaxed, glassy particle distributions 
using several iterations of Phurbas's void and clump de- 
tection algorithms, and then tile these distributions to 
fill the problem domain. This also ensures that initially, 
no particle adaption occurs until the particles move far 
enough to justify it. This allows both the use of a real- 
istic particle distribution and makes it possible to create 
a set of predictably refined initial conditions for conver- 
gence studies. Irregular initial particle distributions also 
remove the need to set up tests at odd angles to prevent 
aligning waves and shock fronts with a grid. 

3.1. Linear Waves 

To verify the convergence of the scheme and its accu- 
racy in the linear domain, three MHD waves are modeled. 
The three waves are a magnetoacoustic wave propagat- 
ing perpendicularly to the background magnetic field, 
a shear Alfven wave propagating along the background 
field, and a sound wave propagating in an unmagnetized 
medium. We compare the results with solutions of the 
dispersion relation for the MHD equation, including the 
bulk viscosity Q. The solutions for the magnetoacoustic 
and sound waves are derived in Appendix [A"| 

To set this problem up, we use a cubic domain, with 
side length 1.0, is initialized with density p = 1, pres- 
sure P — I/7, and adiabatic index 7 = 5/3, in which 
the waves propagate in the ^-direction. For the mag- 
netoacoustic wave the magnetic field is B = ^/3z, for 
the Alfven wave it is B = l.Ox, and for the sound wave 
the magnetic field is set to zero. The initial condition is 
generated from a relaxed particle distribution so that no 
particle deletion or addition initially occur. For runs with 
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Figure 1. Convergence of linear waves. Points show the relative 
Li-norm errors derived from the comparison of Phurbas results to 
solutions of the MHD dispersion relation including the stabilizing 
bulk viscosity (see Appendix [A| . The slopes plotted show second- 
order convergence. 




Figure 2. Circularly polarized Alfven waves after five box cross- 
ings with resolutions given by the legend. Note the rapid conver- 
gence of the higher resolution models. 

resolution higher than the lowest value, the same set of 
particles used in the lowest resolution test is scaled and 
tiled to fill the computational volume. The wave is prop- 
agated one wavelength, and then the Li-norm error is 
summed across all fields. We analyze the convergence of 
the relative L\ error, dividing the absolute error by the 
amplitude given by the linear analysis in Appendix [A] 
as for the waves with a compressive nature (sound and 
magnetoacoustic) , the analytical solution varies with res- 
olution. All the waves are observed to converge with at 
least second order accuracy, as shown in Figure [T] 

3.2. Circularly Polarized Alfven Wave 

For our next test of MHD, we run finite-amplitude, cir- 
cularly po larized , Alfv en plane waves using the param- 
eters from Toth ( 2000 ) in a cubic domain with periodic 
boundary conditions, with the propagation direction par- 
allel to th e x-axis. Th is is equivalent to a = in the for- 
malism of T6th| (2000 ). Specifically, the initial conditions 
are p = 1.07/1^^11707 B x = 1.0, V y = 0.1sin(27ra) = B y , 
V z = 0.1cos(2ttx) = B z , P = 0.1, and 7 = 5/3. The 
wave is propagated five wavelengths, returning to its orig- 
inal position. The solutions shown in Figure [2] show that 
at low resolution, the wave is strongly damped, while as 
resolution increases the wave rapidly converges. Figure 
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possible MHD discontinuities. Ryu & Jones (1995) in 
their Figure 4d show a test (denoted RJ 4d) of other d is- 



Figure 3. Li-norm errors for circularly polarized Alfven waves 
run with resolution A after five box crossings. The plotted reference 
slope is 2. 

[3] shows the L\ norm errors, which converge faster than 
second order. 

3.3. Shock Tubes 

To test performance on supersonic and super- Alfvenic 
problems, we set up several shock tube problems. We 
used the parameters given in Table [l] on long rectangular 
domains with periodic boundaries.^ spatially constant 
resolution criterion A allows comparison to grid based 
codes, while an appropriately density dependent reso- 
lution criterion allows comparison to other Lagrangian 
methods. For the constant A tests we denote the dis- 
tance along the long axis of the domain in units of A. 

3.3.1. Sod Shock Tube 



We first run the classic Sod ( 1978 ) shock tube, in gas 
with 7 = 7/5 on a domain of size 64 x 1 x 1 units. A 
smooth transition between the left and right states with 
a width of 0.3 units is described with a fifth-order spline. 
In the first version of this test we use a refinement criteria 
A = 0.125/9 1//3 that yields a constant mass per resolution 
element (that is, per region with volume of 4/37rA 3 ). This 
is in effect the resolution criterion used by SPH. In Fig- 
ure0]the solution at time t — 13.8 is shown. For this test 
the x axis on the plot is denoted in units of A in the left 
initial state. Unlike SPH, Phurbas supports more gen- 
eral refinement criteria. As a simple example, we ran the 
same Sod test problem with spatially constant resolution 
A = 0.125. Figure [5] shows the result at time t = 15. The 
result is generally very similar to the result with mass- 
based refinement, with the main change being that the 
shock is thinner, as the local resolution is higher in the 
constant-refinement case. In either case, the shock speed 
is reasonably well reproduced. 

3.3.2. MHD Shock Tubes 

We next perform a suite of MHD shock tube tests, 
selecting from the large set of standard MHD Ricmann 
problems used in the literature . To tabulate refer ence so- 
lutions, we have used Athena ( Stone et al.|2008"| at high 
resolutio n in I D. The first test we show is tire classical 
Brio- Wu [1988 shock tube test, which is a standard prob- 
lem, though hot particularly stri ngent. The test shown in 
Figure 2a of Ryu & Jones ( 1995 denoted RJ2a) provides 
a more complete test of the appearance of the various 



continuities. The problem described by Falle 
his Figure 6 (denoted F6) is specifically used 



(2002) in 
o demon- 



strate shock errors in non-locally-conservative methods. 
Finall y, the test shown in Figure 6 of |Dai fc Woodwardl 



( |1994| , as well as in Figure la of Ryu & Jones (1995 



denoted RJla) is commonly used as a stringent test o 
V • B errors in shocks, and in the case of Phurbas demon- 
strates the effects of local non- conservation errors asso- 
ciated with strong MHD shocks. Note that with the ex- 
ception of the Brio-Wu test, all these MHD shock tube 
tests use an adiabatic gas with 7 = 5/3. 



Brio-Wu— The Brio-Wu [1988] shock tube (see Table [IT 
is an MHD analog to the Sod shock tube problem. We 
set up this test in gas with 7 — 2, on a fully periodic 
domain of 128 x 1 x 1 units, with constant resolution 
A = 0.125. A smooth transition between the left and 
right states with a width of 3 units was produced with a 
cosine function. The width of the transition region was 
chosen to avoid excessive start-up transients. 

As the problem was run in two mirror images in a pe- 
riodic volume, only half of the periodic volume used is 
shown in Figure [6j with the x-axis labeled in A units, 
at time t = 7.3. The solution captures the fundamental 
features of the problem, including, from left to right, the 
rarefaction fan, the compound wave, the contact discon- 
tinuity, the slow shock, and the fast rarefaction wave. 

The final panel of Figure [6] shows a measure of the 
effect of V-B. Here the quantity \(V ■ B)/\B\ is the frac- 
tional magnetic field error on the scale of A. Grey points 
in this panel show the raw values of this quantity, with 
maximum magnitude 10~ 3 . However, the scatter is very 
symmetric, indicating that most of it can be attributed 
to the truncation error in the approximation of V • B it- 
self. To extract the coherent skew from zero, we plot the 
data binned in bins with width A as the black step curve, 
demonstrating that the normalized V B errors are less 
than 10" 4 . 

The Athena solution that we compare our result to, 
as well as the usually accepted numerical solutions to 
the Brio-Wu problem, show a compound wave structure, 
seen at x ~ —25 in Figure [6} This struct ure should 
not formally exist ( Falle & Komissarov 2001 ) , but most 
multidimensional numerical methods, including Phurbas, 
show it as part of the solution. 



RJ2a — The te st shown in Figure 2a of [Ryu fc Jones 



( p95| (also see [ Dai & Woodward| ( fl994| ) Table 3a) is 
shown in Figure [7] at time t = 12. Initial conditions for 
this test are listedin Table [T] as RJ2a. This test is other- 
wise set up in the same manner and on the same domain 
as the Brio-Wu test. A high resolution solution was com- 
puted with Athena for comparison. As |Stone et al. ( 2008 \ 
point out, this test is particularly interesting because it 
requires modeling a fast magnetosonic shock and a rota- 
tional discontinuity in each direction of propagation, as 
well as a contact discontinuity in the center. No visible 
ringing is seen at the shocks. The largest coherent V B 
errors are also seen near the fast shocks, but the largest 
scatter in particle V B values is located at the contact 
discontinuity. 
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Shock Tube Left and Right States 
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Figure 4. Sod (1978) shock tube, run with mass-based refinement, equivalent to a Lagrangian method such as SPH. The rr-axis is denoted 
}f A of the initia 



in units of . 



initial left state density. Phurbas result in black, reference solution in gray. 



RJjd — The test shown in Figure 4d of |Ryu fe; Jones| 
(1995) is shown in Figure [8] at time t = 11.5. This test 
was run with the same computational domain and res- 
olution as the previous test but with the left and right 
states listed in Table Q] as test RJ4d. A small overshoot 
can be seen on the leftmost rarefaction wave. This is a 
result of the bulk viscosity of the scheme. 



F6— The test shown by ]Mle| (|2002f in his Figure 6 is 
shown in Figure [9] at time z 



9. 



"This test was use d to 
show an error in the ZEUS ( Stone fc Norman||1992 1 so- 



lution. The initial states are listed in Table [T] as test F6 
and the problem was run in a domain of 64 x 1 x 1 units 



with A = 0.25. Fixed boundaries were used in the x di- 
rection and periodic boundaries in the y and z directions. 
Compared t o the nonconservative ZEUS result shown in 



Falle (20021) the slow shock is captured more accurately. 
At tins resolution, the slow shock and the contact dis- 
continuity directly behind it have not yet separated, and 
the bulk viscosity of the scheme is causing an undershoot 
in density at the foot of the contact discontinuity. The 
left-going features also show overshoots in density and 
specific internal energy that are initial transients caused 
by the bulk viscosity. These effects are not however due 
to significant nonconservation errors. 



G 




Figure 6. Brio-Wu 1988 shock tube solution with Phurbas solution in black, and high-resolution Athena solution in gray. Lower right 
panel: One in ten unbinned values of V B as grey points, binned values as black steps. 
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Figure 7. |Ryu fc Jones| ( 1995 ) test from their Figure 2a (model RJ2a). High resolution Athena solutions are shown in gray, while the 
normalized V • B is shown as gray points, and the binned values are shown in black steps. One in ten particles is plotted in the V-B scatter. 




Figure 8. |Ryu fe Jones| | |1995| test from their Figure 4d (model RJ4d). High resolution Athena solution shown in gray, while the 
normalized V • ts is shown as gray points, and the binned values are shown in black steps. One in ten particles plotted in V B scatter. 
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Figure 9. Fallc (2002) test from their Figure 6 (our test F6), which was used to demonstrate an error in the ZEUS algorithm. High 
resolution Athena solution shown in gray, while the normalized Vi? is shown as gray points, and the binned values are shown in black 
steps. One in ten particles is plotted in V-B scatter. 
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Figure 10. Riemann problem test shown by lRyu fe Jones] l |1995|) in their Figure la, run without elliptic projection V-B correction. High 
resolution Athena solution shown in gray, while the normalized V B is shown as gray points, and the binned values are shown in black 
steps. One in ten particles is plotted in the V-B scatter. 
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This test also appears in Ryu fc Jones (1995 ) in their 
Figure la, in Table 6 of |T6th| ( |2UDg| 7anTm |Mignone 
et al. (2010). In the latter two It is used as a demonstra- 
tion of V • B errors. We show this problem here as a test 
of the technique used in Phurbas to handle V • B, though 
we expect that the strength of the shocks in this prob- 
lem lies outside of the intended use regime of Phurbas. 
A computational volume 64 x 1 x 1 units was used, with 
fixed value boundaries in the x direction, and periodic 
boundaries in the y and z directions with left and right 
states listed in Table [TJ as test RJla. As IDai fc Wood-1 
ward ( |1994[ ) show, the solution consists of a right going 
fast shock with Mach number 6.54, a left going fast shock 



with Mach number 2.54, a slow shock, a slow rarefaction, 
and a contact discontinuity. Phurbas shows a V • B error 
of a few parts in 10 in the region lying between the two 
fast shocks. 

3.4. Kelvin-Helmholtz Instability 

As an example of multidimensional smooth flow with 
bulk motions, we demonstrate the performance of Phur- 
bas on three-dimensional Kelvin-Helmholtz instability. 
Recently Kelvin-Helmholtz instability has attrac ted sig- 
nificant disc ussion following the demonstration by Agcrtz 
et al. ( 2007 ) that some SPH formulations fail to show the 
growth of the instability in a particular test problem. 
A number of works have discussed aspects of Kelvin- 
Helmholtz instability in numerical methods used in astro- 
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physics, particularly i n Lagrangian schemes ( Price||2008 



ICha et al. 2010 



Read et a 1. 2010 



Wadsley etaT] [20081 
2010; ISpringe l 2010a; 

Valcke et aI.|2010||Springel|2010bl|Rob"ertson eta 
Springel| |20ir ~ 



t 

b; 



tieii & Sp ringe 
unk et al. 2010 



2010 



In the termin ology of Robertson et al. (2010) and 
Springcl (2010b]) we use a smoothe d interla ce initial con- 
dition! However, unlike Springel (2010b) we choose a 



smoothing of the interface such that a closed-form alge- 
braic expression can be computed for the first order, lin- 
ear, perturbation theory prediction for the growth rate in 
an inc ompressible flow on an infinite domain. [Wang et al.| 
(2010) have derived such an analytic treatment, provid- 



ing an exact analytic benchmark for the first time. From 
that work, we select an initial condition with an expo- 
nential smoothing. The initial condition for the Kelvin- 
Hclmholtz test is as follows in coordinates where x is 
parallel to the direction of flow, and z is in the direction 
of slab symmetry. Density is given by 




6.0 0.2 0.4 0.6 0.8 1.0 



Pi - p m e^ if 1 > y > 

-y + l 

. P2 + p m e L if 2 > y > 1 

P — \ ~(3-y) .. „ „ 

P2 + p m e l if 3 > y > 2 

~t»-3) 

Pi - p m e t. it 4 > y > 3, 



(1) 



where p x = 1.0, p 2 = 1.1, p m = (l/2)(pi - p%), and L = 
0.025. The velocity field is given by a similar smoothed 
profile with a perturbation in the x direction 





- U m e 


y-i 

L 


u 2 


+ U m e 


-y+l 

L 


u 2 


+ U m e 


-(s-v) 

L 


Ui 


~ U m e 


-(y-3) 

L 



if 1 > y > 
if 2 > y > 1 
if 3 > y > 2 
if 4 > y > 3, 



(2) 



where U x = 0.5, U 2 = -0.5, U m 
perturbation in the y direction 



(l/2)(t/i-C/ 2 ), and a 



V y — Sv sin(47rx) < 



exp(47r(y — 1)) 
exp(47r(— y + 1)) 
exp(47r(-(3 - y)) 
exp(47r(-(t/ - 3)) 



if 1 > y > 

if 2 > y > 1 

if 3 > y > 2 

if 4 > y > 3, 



(3) 

where Sv = 0.01. Pressure is set to 2.5, and 7 = 5/3. 

To extract the growth of the velocity perturbation, we 
use a discrete convolution over the particles. We use this 
technique, as opposed to a discrete Fourier transform on 
gridded values, as the discrete convolution maps more di- 
rectly to the meshless Phurbas discretization. We define 
the magnitude of the unstable y-velocity mode 



M 



(4) 



Figure 11. Kelvin-Helmholtz instability test presented at the 
highest resolution used here (A = 1/256). A density slice is shown 
at t=2.52. 



where all the sums run over N particles, and 



Si 



V y sin(47rx) exp(- 


-4n\y - 


-1|) 


V y sin(47ra;) exp(- 


-4tt|(4 


-y)-i 


V v cos(47ra:) exp(- 


-47r|y - 


-i|) 


V v cos(47ra) exp(- 


-4tt|(4 




exp(-87r|y - 1|) 




ify<2 


exp(-87r|(4-2/) 


-1|) 


if y > 2. 



if y < 2 
if y > 2, 

if y<2 
if y > 2, 



(5) 

'(6) 
(7) 
(8) 

The domain is 4 x 1 units with thickness 1/32, 1/64, 
or 1/128. simulated with uniform target resolutions A = 
1/64, 1/128, and 1/256. 

In Figure [TT] the developed state of the highest resolu- 
tion run is shown, while Figure [12] shows the growth of 
the unstable y-direction velocity mode and the maximum 
y-direction kinetic e nergy. The linear p erturbation the- 
ory predictions from Wang et al. (2010) for the growth 
rate of the mode amplitude and the maximum y-direction 
kinetic energy are also plotted as a guide, but due to the 
periodic domain, compressibility, and finite perturbation 
magnitude we cannot expect the result to exactly follow 
this curve. However, clear convergence toward the linear 
theory can be seen as resolution increases. A more so- 
phisticated, multiple code verification study, circumvent- 
ing the limitations of the incompressible theory compari- 
son for a similar problem, will be presented in a separate 
publication. 

3.5. Magnetorotational Instability in a Cylinder 

The magne torotational instability (MRI; jBalbus fc 
Hawley|1991 ) is thought to be an important mechanism 
for driving turbulence in protoplanetary disks (IBalbus & 



Hawley 



1998) 



Floe 



We have p erformed a test similar to one 
( 2010 ) to examine the growth rate of 



c et al 



done by 

MRI during its linear phase. For this test, we remove the 
background Keplerian shear flow from the velocity field, 
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Figure 12. Left: Kelvin-Helmholtz instability mode growth for 
models with A = 1/64 (dashes), 1/128 (tight-spaced dashes), and 
1/256 units (dotted), compared to the predicted grow th rate (solid) 
from the linear, incompressible theory derived by Wang et al.| 
(2010). Right: Maximum y-direction kinetic energy for the same 
runs, again compared to the predicted growth rate from linear, 
incompressible theory. 



and evolve only the perturbations to the initial steady 
state velocity. That is, the velocity field we evolve and 
fit in this test is V 1 = V — Qrcf) where r is the cylin- 
drical radius from the point (0.5,0.5), 57 = r -15 is the 
Keplerian angular velocity and (j) is the unit vector in 
the azimuthal direction. Additionally, to ensure that the 
magnetic field configuration we study is consistent be- 
tween resolutions, we solve only for perturbations from 
the initial magnetic field configuration. Solving only for 
the perturbations is particularly useful here as the MRI 
grows fastest where the orbital timescale is smallest. 

The magnetic field configuration used in this test, an 
annulus of uniform vertical field, is chosen to artificially 
cut off the growth of MRI at a finite radius. Ensuring 
that the MRI is cleanly shut off and that the annulus is 
represented in a constant manner allows clearer measure- 
ments of the convergence of the growth rate in the mag- 
netized annulus. The domain is an annulus with radius 
ranging from 0.08 to 0.32 and height 0.0375 units. The 
vertical boundaries are periodic. The inner and outer ra- 
dial boundaries are fixed-value, arranged by preventing 
time evolution for particles with radius greater than 0.3 
or less than 0.1. The initial density is 1.0 everywhere. 
A vertical magnetic field is imposed with radial profile 
B z (r) = B b p {r), where B = 0.0824 and 



M r ) = \ \ + \ tanl1 ' r> sili 



2tt(V-0.05) 
02 



(9) 



which gives a magnetized annulus. The sound speed in 
the magnetized annulus was set to c s — 0.824, and the 
internal energy in the non-magnetized region adjusted so 
that the total pressure (thermal plus magnetic) is con- 
stant. The radial velocity is perturbed with 



V r (z) = lQ~ 5 c s b p {r)cos(2irz/0.l). 



(10) 



Spatially constant resolutions of A = 1/240, A = 1/320, 
and A = 1/400 were used corresponding to A = 1/9Amri, 
A = 1/12 A M ri, A = 1/15A M ri where A M ri = 0.0375 
is the wavelength of the fastest growing MRI mode at 
r = 0.17. 

We then measure the growth of the most unstable 
mode at r = 0.17. Figure 13 shows the radial mag- 
netic field configuration achieved at time 0.94 (or 2.13 
orbits at r = 0.17) for all three resolutions. To calculate 
the mode amplitude M, we use a convolution defined di- 
rectly on the particles, instead of gridding and Fourier 
transforming the data. The motivations are the same as 
when this procedure was used for the Kelvin-Helmholtz 



test. In this case, 



M 



12 (U 



where all sums run over the N points, and 



Si = B r ^ sin 



2tt 



0.0375 



-Zi exp 



Oi - r ) 2 



(11) 



(12) 



c* = B rA cos | zi ) exp [ — ] , (13) 



di = exp 



0.0375 



in -r Q f 



(14) 



The chosen width of the convolution a — 2.7 x 10~ 6 min- 
imizes the radial range that influences the measurement, 
while still giving sufficiently low sampling noise. 

We plot the evolution of the mode amplitude in Fig- 
ure [14] together with the maximum growth rate of 
exp(0.75f2) predicted by a linear perturbation analysis of 
vertical field MRI. The modeled growth rates are reason- 
ably consistent with the prediction from the linear anal- 
ysis for the fastest growi ng mode. Import antly, in the 
context of the findings of Flock et al. (2010), where spu- 
riously high growth rates were observed, we find growth 
rates slightly lower than the theoretical maximum value. 

4. SUMMARY AND DISCUSSION 
4.1. Effective Resolving Power 

To determine if Phurbas can be used for practical com- 
putations, we need to establish some guidelines for its 
relative ability to resolve particular phenomena. This 
should be done cautiously, as different classes of al- 
gorithms have different properties in each flow regime. 
An equivalence or difference between algorithms in one 
regime may not hold across different regimes. In any 
case, it is expected that an unstructured mesh or un- 
structured meshless method will have a lower effective 
resolution than a structured mesh method. This is be- 
cause a given number of resolving elements can represent 
the largest possible set of wavelengths when they are ar- 
ranged in a regular manner. Given these caveats, we can 
compare the test results that we have presented here to 
examples computed with Eulerian, mesh-based schemes. 

The first example is the circularly polarized Alfven 
wave test. The lowest resolution, three-dimensional, 
Athena resu l ts in a rectangular domain published in 
Stone et ah] ( |2008| Fig. 33) correspond to 20 and 39 
cells per wavelength, computed with third order spatial 
reconstruction and HLLD fluxes. The Phurbas results on 
this test at A = 1/8 and A = 1/16 of a wavelength appear 
to roughly equal the accuracy of the 20 and 39 cells per 
wavelength Athena results in the sense that the final am- 
plitude of the wave in the Phurbas results is closer to the 
analytically correct value even though there are fewer res- 
olution elements used per wavelength. This result should 
be interpreted cautiously, as the two codes have different 
and nonlinear numerical dissipation. Nevertheless, this 
can be interpreted to mean that the Phurbas effective 
resolution A is roughly equal to two Athena cells as a 
measure of resolution. (On average, there should be one 
particle in each volume of radius A.) In this sense Phur- 
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Figure 13. Linear phase growth of MRI shown via azimuthal field slices at time 0.94 for resolutions (from top to bottom) of A = 
1/160, 1/240, 1/320, and 1/400. 

havior. For systems where the bulk velocity varies as a 
multiple or large fraction of the wave speeds, this means 
Phurbas can capture the flow with more uniform fidelity 
across the domain. Techniques that add an extra ad- 



LinearTheory Rate 

A=1/6A ME1 
A=1/9\ 1KI 
A=1/12A MIII 
A=1/1BA™ 




Figure 14. Linear phase gr owt h of MRI shown for the mode 
amplitude given by equation at r = 0.17 for the resolutions 
A given in the legend, compared to the growth rate predicted by 
linear theory. 

bas with a third-order polynomial fit is competitive with 
a spatially third-order grid code. 

A possibly more operationally useful comparison can 
be dr awn from the r esults of the linear phase MRI growth 
test. Flock et aL] ( 20101 claim that in the code Pluto 
( Mignone et al.||2007[) with piecewise linear reconstruc- 
tions and an HLLD Riemann solver, 10 cells per wave- 
length are req uired to resolve the growth of MRI. Our 
shows Phurbas requires ~ 9 A per 
to resolve the growth. Thus, for this 
tes t we can say that one cell w 1A. The algorithm used 



test in section 
MRI wavelengt 



3.5 



by Flock et al. ( 2010 ) has a stencil size of five cells, while 
Phurbas can be said to have a stencil size of 2r/ = 4.6A, 
so the same rough proportionality holds between the two 
algorithms when expressed in terms of stencil size. 

4.2. Advantages and Disadvantages 

The main advantage of Phurbas is its Lagrangian na- 
ture. Eulerian codes suffer from numerical dissipation 
that varies with the speed and direction of the flow across 
the grid. Phurbas's formulation cleanly avoids this be- 



vection step to an Eulerian method (e.g. Masset 2000 
Johansen et al. 2009) can only efficiently handle simple 
flow geometries. Moreover, in Phurbas, the time step 
is only dependent on Galilean-invariant quantities. For 
flow with bulk velocities greater than the signal speeds 
that determine the Courant time step limit, Phurbas has 
a significant advantage in the number of time steps that 
must be computed to reach a given physical time. 

The resolution criterion in Phurbas is spatially contin- 
uous, unlike in adaptive mesh refinement techniques, so 
there are no abrupt resolution jumps that can lead to 
undesirable artifacts. Also, due to the Lagrangian and 
meshless nature of the code, refined regions can be arbi- 
trarily shaped and follow the flow with minimal creation 
and deletion of resolution elements. 

As Phurbas formally computes strong solutions to the 
governing partial differential equations using a mcthod- 
of-lines type approach, it is relatively simple, compared 
to Godunov methods for example, to switch to an alter- 
nate set of variables or even alternate equations. The 
central limitation is only that the equations should be 
amenable to the explicit Hermitian time integration used. 
However, changing the time integration scheme would 
not fundamentally alter the method. 

A moving unstructured mesh or meshless method must 
win in terms of the Galilean invariance of the numerical 
diffusion, adaptivity, and the time step advantages, since 
the quality of the spatial derivatives on an unstructured 
set of discretization points is lower than for equivalent 
fits on a grid of points. For problems where the La- 
grangian and adaptive nature of Phurbas is of no signif- 
icant ad vantage, a grid-based method such as the Pen- 
cil Code ( |Brandenburg fc Dobler|2002 ) remains substan- 
tially more efficient. 

4.3. Future Prospects 



12 



Several enhancements to Phurbas can be made, which 
we briefly mention here. 

• The Phurbas discretization is very flexible in how 
it can incorporate governing equations other than 
ideal MHD. Implementing non- ideal MHD is rela- 
tively simple with forward-time-centered-space vis- 
cosity and resistivity operators. 

• Self-gravity can be implemented using the existing 
GADGET-2 tree algorithm and particle masses 
derived from a local Voronoi tessellation. 

• Passive tracer particles and interacting dust parti- 
cles can be added in a straightforward manner, us- 
ing the spatial fitting and time integration methods 
used for gas particles. 

• In the moving least squares procedure the choice 
of least squares error and Cartesian polynomial 
functions for the fitting procedure is not obviously 
the ideal choice, and alternate fitting procedures 
should be explored. These could include those with 
a basis that can be used to minimize the variation 
or oscillation of the fitted function, and fitting the 
magnetic field using a se t of divergence-free basis 
functions ( |McNally||20lT| ). 

• Least-squares minimization itself does not appear 
to be a unique choice, and less computationally 
intensive gradient approximation procedures could 
be used. 

• Non-Lagrangian particle trajectories can be added, 
using similar logic t o the steering of Voronoi cell 



noted in Paper I, this was a contribution equivalent to 
co-authorship. We are indebted to Volker Springel for 
maki ng the source c ode of GADGET-2 publicly avail- 
able ( |Springel 2005). We also thank the authors of 



generating centers in 
reduce the number o 



Springel (2010a). This could 



particle additions and dele 



tions, and hence diffusivity in shear flows. 

• An improved void check algorithm could reduce 
the number of redundant particle creation requests, 
minimizing the potentially expensive step of prun- 
ing of the proposed particle addition list. 

• A diffusion with properties similar to a hyperdif- 
fusion would be of great advantage if applied to 
the density and internal energy fields to smooth 
out the smallest scale structures. The challenge of 
this enhancement is to find an approximation with 
sufficiently robust conservation properties. 

Evidently, there are many possible alterations and exten- 
sions that can be made to Phurbas that will significantly 
alter both the nature of the scheme and the capabilities 
of the code. We believe that Phurbas is not just a new 
method for MHD, but one of the first practical examples 
of a new class of schemes for mathematical modeling of 
similar physical problems. 



Simon CO. Glover wrote the initial version of the rou- 
tine to couple the Phurba MHD module to GADGET- 
2, which lies at the core of Phurbas. The tests shown 
here are of the version of the code incorporating inter- 
polating fits as advised by the anonymous referee. As 



Athen a lor making it publicly available ( Stone et al. 
2008| ) . M.-M.M.L. and C.P.M. acknowledge hospital- 
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Institut fur Theoretische Astrophysik dcr Uni. Heidel- 
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APPENDIX 

DISPERSION RELATION AND MODIFIED WAVES 

The test in |3.1| is a convergence test to the solution of the linearized version of the modified MHD equations that 
Phurbas solves. We derive here the dispersion relation and solutions in the cases used in the convergence test. We 
start with the MHD mass, momentum, and induction equations, with a bulk viscosity term pV(£V • V) added to the 
momentum equation, 



d t p + pV • V = 
pd t V + VP - (V x B)B + pV(CV • V) = 
PlVx(VxB) = 0, 

and for adiabatic gas we also use an energy equation 

P 



d, 



= o. 



(Al) 
(A2) 
(A3) 



(A4) 



In these equations, p is density, V is velocity, P is pressure, B is magnetic field, and 7 is the adiabatic index. 
Linearizing, and taking the viscosity £ as constant we obtain 



d t p + p V ■ V = 
p d t V + VP - (V x B) x B + Po (V(V ■ V) = 
-d t B + V x (V x B) = 

P 7 p N 



dt 



Po Po 



= 



(A5) 
(A6) 
(A7) 

(A8) 



where subscripts indicate background values, and unsubscripted variables are fluctuations. Substituting the form 
exp(j(k • r — tot)) as a dependence for all variables to search for wave solutions gives 



— ujp + pok. • V = 
-up a V + kP - (k x B) x B + C^k(k • V) = 
wB + k x (V x B ) = 
( P 1P 

—U! 

V^O P 

Solving these transformed version of the mass, induction, and equation of state for the perturbed variables gives 

k V 



= 



(A9) 
(A10) 
(All) 

(A12) 



P = Po- 



P = jP - 



UJ 

k- V 



B 



(k-V)B -(k-B )V 



(A13) 
(A14) 
(A15) 



We can then substitute these into the linearized momentum equation to yield 



(k-B ) 2 

Po 



V 



7^ + M 
Po Po 



k- 



(k ■ B ) 

Po 



B (k • V) - 



(k-B )(V-B ) 

Po 



k - c^k(k • v) 



Substituting in the Alfven and sound speeds 
we obtain 



B «B 

Po 



V s = 



IjPo 
Po 



[ W 2 -(k-V A ) 2 ]V = 

{(V s 2 + Vj)k - (k • V A )V A } (k • V) - (k • V A )(V ■ V A )k - Ciwk(k • V). 



(A16) 



(A17) 



(A18) 
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We can now choose a propagation direction of the wave, the direction of the wave vector k. The wave vector k is 
chosen to lie in the x — z plane, at an angle 9 clockwise from the z-axis. This gives 

[lj 2 - k 2 V\ cos 2 0]V = 

{(V s 2 + VZ)k-Z\k\V%cos9} (k- V) -/j 2 V a cos6>(V- V A ) - Ok(k • V). (A19) 

Upon transforming this into a matrix eigenvalue problem, and solving for the eigenvectors V one obtains expressions for 
the eigenvalues, which can be solved for lj. Unfortunately, we have only found analytic solutions for two special cases of 
the propagation direction 9. These cases are used to test the convergence of our scheme. For propagation perpendicular 
to the background magnetic field Bo, 9 = tt/2 and the first eigenvalue corresponds to a magnetoacoustic wave, with 
speed given by *He(w)/fc = yJ\k 2 C, 2 — (Vf + V s 2 )|, and attenuation given by 3m(uj) = ik 2 (/2. For propagation in the 
direction of the background magnetic field Bo, 9 — and the first eigenvalue corresponds to a sound wave, with speed 
given by 9le(w)/fc = y/\k 2 ( 2 /4 — (K 2 )!) an d attenuation given by 3m(w) = ik 2 (,/2. In general, the shear Alfven wave 
eigenmodes are unaffected by the addition of the bulk viscosity, as they do not involve compressive motions. 



